Edinburg
MTBench: A Multimodal Time Series Benchmark for Temporal Reasoning and Question Answering
Chen, Jialin, Feng, Aosong, Zhao, Ziyu, Garza, Juan, Nurbek, Gaukhar, Qin, Cheng, Maatouk, Ali, Tassiulas, Leandros, Gao, Yifeng, Ying, Rex
Understanding the relationship between textual news and time-series evolution is a critical yet under-explored challenge in applied data science. While multimodal learning has gained traction, existing multimodal time-series datasets fall short in evaluating cross-modal reasoning and complex question answering, which are essential for capturing complex interactions between narrative information and temporal patterns. To bridge this gap, we introduce the Multimodal Time Series Benchmark (MTBench), a large-scale benchmark designed to evaluate large language models (LLMs) on time series and text understanding across financial and weather domains. MTBench comprises paired time series and textual data, including financial news with corresponding stock price movements and weather reports aligned with historical temperature records. Unlike existing benchmarks that focus on isolated modalities, MTBench provides a comprehensive testbed for models to jointly reason over structured numerical trends and unstructured textual narratives. The richness of MTBench enables the formulation of diverse tasks that require a deep understanding of both text and time-series data, including time-series forecasting, semantic and technical trend analysis, and news-driven question answering (QA). These tasks target the model's ability to capture temporal dependencies, extract key insights from textual context, and integrate cross-modal information. We evaluate state-of-the-art LLMs on MTBench, analyzing their effectiveness in modeling the complex relationships between news narratives and temporal patterns. Our findings reveal significant challenges in current models, including difficulties in capturing long-term dependencies, interpreting causality in financial and weather trends, and effectively fusing multimodal information.
Exploring Transfer Learning for Deep Learning Polyp Detection in Colonoscopy Images Using YOLOv8
Vazquez, Fabian, Nuñez, Jose Angel, Fu, Xiaoyan, Gu, Pengfei, Fu, Bin
Deep learning methods have demonstrated strong performance in object detection tasks; however, their ability to learn domain-specific applications with limited training data remains a significant challenge. Transfer learning techniques address this issue by leveraging knowledge from pre-training on related datasets, enabling faster and more efficient learning for new tasks. Finding the right dataset for pre-training can play a critical role in determining the success of transfer learning and overall model performance. In this paper, we investigate the impact of pre-training a YOLOv8n model on seven distinct datasets, evaluating their effectiveness when transferred to the task of polyp detection. We compare whether large, general-purpose datasets with diverse objects outperform niche datasets with characteristics similar to polyps. In addition, we assess the influence of dataset size on the efficacy of transfer learning. Experiments on the polyp datasets show that models pre-trained on relevant datasets consistently outperform those trained from scratch, highlighting the benefit of pre-training on datasets with shared domain-specific features.
Deep Learning for Early Alzheimer Disease Detection with MRI Scans
Rafsan, Mohammad, Oraby, Tamer, Roy, Upal, Kumar, Sanjeev, Rodrigo, Hansapani
Alzheimer's disease (AD) is a neurodegenerative condition characterized by dementia and impairment in neurological function. The study primarily focuses on individuals above age 40, in whom the disease affects memory, behavior, and cognitive processes of the brain. Diagnosing Alzheimer's disease requires a detailed assessment of the patients' MRI scans and neuropsychological tests. This project compares existing deep learning models in the pursuit of enhancing the accuracy and efficiency of AD diagnosis, specifically focusing on the Convolutional Neural Network, the Bayesian Convolutional Neural Network, and the U-net model, using the Open Access Series of Imaging Studies brain MRI dataset. In addition, to ensure robustness and reliability in the model evaluations, we address the challenge of class imbalance in the data. We then perform a rigorous evaluation to determine the strengths and weaknesses of each model by considering sensitivity, specificity, and computational efficiency. This comparative analysis not only sheds light on the future role of AI in revolutionizing AD diagnostics but also paves the way for future innovation in medical imaging and the management of neurodegenerative diseases.
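As a rough illustration of two of the evaluation criteria named above, the sketch below computes sensitivity and specificity from binary predictions. The function name `sensitivity_specificity`, the label convention (1 = AD, 0 = non-AD), and the toy labels are assumptions made for illustration; they are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity: fraction of actual positives that were detected.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of actual negatives that were correctly rejected.
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical, imbalanced toy labels (3 positives, 5 negatives).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting both numbers matters under class imbalance, since plain accuracy can look high even when most positive cases are missed.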
SouLLMate: An Application Enhancing Diverse Mental Health Support with Adaptive LLMs, Prompt Engineering, and RAG Techniques
Guo, Qiming, Tang, Jinwen, Sun, Wenbo, Tang, Haoteng, Shang, Yi, Wang, Wenlu
Mental health issues significantly impact individuals' daily lives, yet many do not receive the help they need even with available online resources. This study aims to provide diverse, accessible, stigma-free, personalized, and real-time mental health support through cutting-edge AI technologies. It makes the following contributions: (1) Conducting an extensive survey of recent mental health support methods to identify prevalent functionalities and unmet needs. (2) Introducing SouLLMate, an adaptive LLM-driven system that integrates LLM technologies, Chain, Retrieval-Augmented Generation (RAG), prompt engineering, and domain knowledge. This system offers advanced features such as Risk Detection and Proactive Guidance Dialogue, and utilizes RAG for personalized profile uploads and Conversational Information Extraction. (3) Developing novel evaluation approaches for preliminary assessments and risk detection via professionally annotated interview data and real-life suicide tendency data. (4) Proposing the Key Indicator Summarization (KIS), Proactive Questioning Strategy (PQS), and Stacked Multi-Model Reasoning (SMMR) methods to enhance model performance and usability through context-sensitive response adjustments, semantic coherence evaluations, and enhanced accuracy of long-context reasoning in language models. This study contributes to advancing mental health support technologies, potentially improving the accessibility and effectiveness of mental health care globally.
COSCO: A Sharpness-Aware Training Framework for Few-shot Multivariate Time Series Classification
Barreda, Jesus, Gomez, Ashley, Puga, Ruben, Zhou, Kaixiong, Zhang, Li
Multivariate time series classification is an important task with a wide range of application domains. Recently, deep neural networks (DNNs) have achieved state-of-the-art performance in time series classification. However, they often require large expert-labeled training datasets, which can be infeasible to obtain in practice. In few-shot settings, i.e., when only a limited number of samples per class are available in the training data, DNNs show a significant drop in testing accuracy and poor generalization ability. In this paper, we propose to address these problems from an optimization and a loss function perspective. Specifically, we propose a new learning framework named COSCO, consisting of a sharpness-aware minimization (SAM) optimization and a Prototypical loss function, to improve the generalization ability of DNNs for multivariate time series classification problems under few-shot settings. Our experiments demonstrate that our proposed method outperforms the existing baseline methods. Our source code is available at: https://github.com/JRB9/COSCO.
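The abstract names a Prototypical loss without giving its form. The sketch below shows the loss as it is commonly defined for prototypical networks (class prototypes are mean embeddings; queries are scored by a softmax over negative squared Euclidean distances). It is an assumption that COSCO uses this exact formulation; the function names and toy embeddings are hypothetical, and the SAM optimizer is not shown.

```python
import math


def class_prototypes(support, labels):
    """Average the support embeddings of each class into one prototype per class."""
    sums, counts = {}, {}
    for x, y in zip(support, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def prototypical_loss(queries, query_labels, protos):
    """Mean negative log-probability of the true class under a distance-based softmax."""
    total = 0.0
    for x, y in zip(queries, query_labels):
        logits = {c: -squared_distance(x, p) for c, p in protos.items()}
        m = max(logits.values())  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        total += log_z - logits[y]
    return total / len(queries)


# Hypothetical 2-D embeddings: two support samples for each of two classes.
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [4.0, 6.0]]
labels = [0, 0, 1, 1]
protos = class_prototypes(support, labels)

loss_correct = prototypical_loss([[0.0, 1.0]], [0], protos)
loss_wrong = prototypical_loss([[0.0, 1.0]], [1], protos)
print(protos, loss_correct, loss_wrong)
```

A query that sits on its own class prototype yields a loss near zero, while the same query labeled with the other class is penalized heavily, which is the behavior the loss is meant to encourage in the embedding space.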
Self Pre-training with Topology- and Spatiality-aware Masked Autoencoders for 3D Medical Image Segmentation
Gu, Pengfei, Zhang, Yejia, Li, Huimin, Wang, Chaoli, Chen, Danny Z.
Masked Autoencoders (MAEs) have been shown to be effective in pre-training Vision Transformers (ViTs) for natural and medical image analysis problems. By reconstructing missing pixel/voxel information in visible patches, a ViT encoder can aggregate contextual information for downstream tasks. However, existing MAE pre-training methods, which were specifically developed with the ViT architecture, lack the ability to capture geometric shape and spatial information, which is critical for medical image segmentation tasks. In this paper, we propose a novel extension of known MAEs for self pre-training (i.e., models pre-trained on the same target dataset) for 3D medical image segmentation. (1) We propose a new topological loss that preserves geometric shape information by computing topological signatures of both the input and reconstructed volumes. (2) We introduce a pretext task that predicts the positions of the centers and eight corners of 3D crops, enabling the MAE to aggregate spatial information. (3) We extend the MAE pre-training strategy to a hybrid state-of-the-art (SOTA) medical image segmentation architecture and co-pretrain it alongside the ViT. (4) We develop a fine-tuned model for downstream segmentation tasks by complementing the pre-trained ViT encoder with our pre-trained SOTA model. Extensive experiments on five public 3D segmentation datasets show the effectiveness of our new approach.
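To make the position-prediction pretext task in (2) more concrete, the sketch below builds regression targets for one 3D crop: the coordinates of its center and its eight corners. The abstract does not specify the coordinate convention, so the (z, y, x) axis order, the normalization by volume size, and the function name `crop_position_targets` are all assumptions for illustration.

```python
from itertools import product


def crop_position_targets(origin, crop_size, volume_size):
    """Return (center, corners) of a 3D crop in coordinates normalized to [0, 1].

    origin      -- (z, y, x) index of the crop's lowest corner in the volume
    crop_size   -- (depth, height, width) of the crop
    volume_size -- (depth, height, width) of the full volume
    """
    center = tuple(
        (o + s / 2) / v for o, s, v in zip(origin, crop_size, volume_size)
    )
    # Each corner is the origin plus either 0 or the full crop extent along each axis.
    corners = [
        tuple(
            (o + d * s) / v
            for o, s, v, d in zip(origin, crop_size, volume_size, offsets)
        )
        for offsets in product((0, 1), repeat=3)
    ]
    return center, corners


# Hypothetical 64^3 crop taken from a 128^3 volume.
center, corners = crop_position_targets((16, 32, 32), (64, 64, 64), (128, 128, 128))
print(center)
for corner in corners:
    print(corner)
```

A network head could then regress these nine points from the encoded crop, which gives the encoder a reason to retain where a crop sits within the full volume.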
Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation
Tang, Haoteng, Liu, Guodong, Dai, Siyuan, Ye, Kai, Zhao, Kun, Wang, Wenlu, Yang, Carl, He, Lifang, Leow, Alex, Thompson, Paul, Huang, Heng, Zhan, Liang
The MRI-derived brain network serves as a pivotal instrument in elucidating both the structural and functional aspects of the brain, encompassing the ramifications of diseases and developmental processes. However, prevailing methodologies, often focusing on synchronous BOLD signals from functional MRI (fMRI), may not capture directional influences among brain regions and rarely tackle temporal functional dynamics. In this study, we first construct the brain-effective network via the dynamic causal model. Subsequently, we introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE). This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks via an ordinary differential equation (ODE) model, which characterizes spatial-temporal brain dynamics. Our framework is validated on several clinical phenotype prediction tasks using two independent publicly available datasets (HCP and OASIS). The experimental results clearly demonstrate the advantages of our model compared to several state-of-the-art methods.
Biden does not have 'cognitive ability' to serve another term, says former WH doctor
Rep. Ronny Jackson, R-Texas, on the health of the president as Biden turns 81 and Texas Gov. Abbott endorses Trump for the 2024 presidential election. The former White House physician for Presidents Obama and Trump expressed concern Monday about President Biden's health and mental acuity as the president turns 81. Rep. Ronny Jackson, R-Texas, said on "FOX & Friends" that the growing concerns, including from the left, are valid. "I've been saying for quite some time now, when he was candidate Joe Biden, that I didn't think that he had the cognitive ability to do the job," said Jackson. Additionally, Jackson emphasized that Biden has "degenerated" over the last three years. "He's got these people that surround him that are inappropriately encouraging him to continue to run because it builds up who they are and what they do. But our border, our wars overseas, our economy, you know, it's just a disaster right now. And he just can't do the job. And it's just on display every day that he's not capable of doing this job anymore," Jackson warned.
Computer Vision for Volunteer Cotton Detection in a Corn Field with UAS Remote Sensing Imagery and Spot Spray Applications
Yadav, Pappu Kumar, Thomasson, J. Alex, Searcy, Stephen W., Hardin, Robert G., Braga-Neto, Ulisses, Popescu, Sorin C., Martin, Daniel E., Rodriguez, Roberto, Meza, Karem, Enciso, Juan, Diaz, Jorge Solorzano, Wang, Tianyi
To control boll weevil (Anthonomus grandis L.) pest re-infestation in cotton fields, the current practices of volunteer cotton (VC) (Gossypium hirsutum L.) plant detection in fields of rotation crops like corn (Zea mays L.) and sorghum (Sorghum bicolor L.) involve manual field scouting at the edges of fields. As a result, many VC plants growing in the middle of fields remain undetected and continue to grow alongside corn and sorghum. When they reach the pinhead squaring stage (5-6 leaves), they can serve as hosts for boll weevil pests. Therefore, they must be detected, located, and then precisely spot-sprayed with chemicals. In this paper, we present the application of YOLOv5m on radiometrically and gamma-corrected low-resolution (1.2 megapixel) multispectral imagery for detecting and locating VC plants growing in the middle of a cornfield at the tasseling (VT) growth stage. Our results show that VC plants can be detected with a mean average precision (mAP) of 79% and a classification accuracy of 78% on images of size 1207 x 923 pixels at an average inference speed of nearly 47 frames per second (FPS) on an NVIDIA Tesla P100 GPU-16GB and 0.4 FPS on an NVIDIA Jetson TX2 GPU. We also demonstrate the application of a customized unmanned aircraft system (UAS) for spot-spray applications based on the developed computer vision (CV) algorithm and how it can be used for near real-time detection and mitigation of VC plants growing in corn fields for efficient management of boll weevil pests.
Detecting Volunteer Cotton Plants in a Corn Field with Deep Learning on UAV Remote-Sensing Imagery
Yadav, Pappu Kumar, Thomasson, J. Alex, Hardin, Robert, Searcy, Stephen W., Braga-Neto, Ulisses, Popescu, Sorin C., Martin, Daniel E., Rodriguez, Roberto, Meza, Karem, Enciso, Juan, Diaz, Jorge Solorzano, Wang, Tianyi
The cotton boll weevil, Anthonomus grandis Boheman, is a serious pest to the U.S. cotton industry that has cost more than 16 billion USD in damages since it entered the United States from Mexico in the late 1800s. This pest has been nearly eradicated; however, the southern part of Texas still faces this issue and is prone to pest reinfestation each year due to its sub-tropical climate, where cotton plants can grow year-round. Volunteer cotton (VC) plants growing in the fields of inter-seasonal crops, like corn, can serve as hosts to these pests once they reach the pin-head square stage (5-6 leaf stage) and therefore need to be detected, located, and destroyed or sprayed. In this paper, we present a study to detect VC plants in a corn field using YOLOv3 on three-band aerial images collected by an unmanned aircraft system (UAS). The two-fold objectives of this paper were: (i) to determine whether YOLOv3 can be used for VC detection in a corn field using RGB (red, green, and blue) aerial images collected by UAS and (ii) to investigate the behavior of YOLOv3 on images at three different scales (320 x 320, S1; 416 x 416, S2; and 512 x 512, S3 pixels) based on average precision (AP), mean average precision (mAP), and F1-score at the 95% confidence level. No significant differences existed for mAP among the three scales, while a significant difference was found for AP between S1 and S3 (p = 0.04) and between S2 and S3 (p = 0.02). A significant difference was also found for F1-score between S2 and S3 (p = 0.02). The lack of significant differences in mAP across the three scales indicated that the trained YOLOv3 model can be used on a computer vision-based remotely piloted aerial application system (RPAAS) for VC detection and spray application in near real-time.
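For readers unfamiliar with the F1-score used above, the sketch below shows how it follows from detection counts as the harmonic mean of precision and recall. The counts are hypothetical and are not results from the study; how detections are matched to ground-truth plants (for example, the overlap threshold) is left out and would be an additional assumption.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 80 VC plants detected correctly, 20 false alarms, 20 missed plants.
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```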